Cherry-pick: Optimize block_reduce_warp_reduce when block size is the same as warp size #599
+43 −30